<<<<<<< HEAD ======= >>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5 <<<<<<< HEAD
=======
>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

Knee Replacement Surgery Outcome

Introduction

This research aims to determine if the postoperative function of a total knee replacement recipient can be modeled based on anthropomorphic data and details of the surgical procedure. We decided to carry out testing on models with two postoperative scoring systems as the response variable. The first score is the “Surgeon’s Score” which measures more objective postoperative metrics about the condition of the replaced knee following surgery. These metrics include: how straight or how flexed the patient can make thier knee, how well aligned the knee appears when inspecting from a frontal perspective and how much laxity or separation can be achieved by placing the lower leg under tension. Finally, the patient is asked about the overall pain they experience with the knee. The resulting score is a value between 0-100 with 80-100 being an excellent result.

The second score we model is the “Patient’s Score” which is generated by three questions: how far are they able to walk, how well can they navigate stairs and do they require any walking aids (crutches, cane, etc).

We hope to create a model that can predict each of the scores based on a variety of categorical and numeric predictors.

Background information

This data has been collected from a Southern California Hospital in collaboration with the Shiley Center for Orthopedic Research. The data is used for both clinical research and postoperative monitoring of implant survivorship. All patient specific information has been removed from the data so it adheres to the Health Insurance Portability and Accountability Act (HIPAA).

Statement

We hope to determine the impact a particular surgeon or the type of implants used has on the patient’s outcome after surgery. We are also interested in exploring the potential relationship between anthropomorphic factors and the patient’s postoperative satisfaction. Finally, we would like to understand if the considerable expense of computer navigation in knee replacement leads to better postoperative scores. Nevertheless, we keep our option opened to investigate other aspects that were not ‘visibile’ or never thought about it but might reveal themselfs as we move on with the project.

Here we may consider KneeScore_Surgeon or KneeScore_Patient as response and Age, Gendar, Weight, Height, BMI, Race, Side, Manufacturer as predictors. Gendar, Race, Side and Manufacturer are some of the categorical predictors and Age, Height, Weight and BMI are numerical predictors.

knee = read.csv('knees.csv')
knee$Surgeon = as.factor(knee$Surgeon)
knee$Year = as.factor(knee$Year)

Functions for model evaluation

plots = function(model) {
  par(mfcol=c(2,2))
  qqnorm(resid(model), col = red, pch = 16)
  qqline(resid(model), col = lightblue, lwd = 2)
  plot(fitted(model), resid(model), xlab = 'Fitted Values', ylab = 'Residuals', main = 'Fitted vs. Residuals Plot', col = purple, pch = 16)
  abline(h = 0, col = green, lwd = 2)
  hist(resid(model), col = blue, main = 'Histogram of Residuals', xlab = 'Residuals')
}

evaluate = function(model){
  # Run each of the tests
  lambda = 1.000001 # this is a hack so the function will return a CrossValidation score - HatValues of 1.0 are causing a division by zero error.
  shapiroTest = shapiro.test(resid(model))
  bpTest = bptest(model)
  rSquared = summary(model)$r.squared
  adjRSquared = summary(model)$adj.r.squared
  loocv = sqrt(mean((resid(model) / (lambda - hatvalues(model))) ^ 2))
  large_hat = sum(hatvalues(model) > 2 * mean(hatvalues(model)))
  large_resid = sum(rstandard(model)[abs(rstandard(model)) > 2]) / (length(rstandard(model) + lambda))
  
  
  # Collect tests in dataframe
  values = data.frame(Result = c(prettyNum(shapiroTest$p.value), prettyNum(bpTest$p.value), prettyNum(rSquared), prettyNum(adjRSquared),prettyNum(loocv),               prettyNum(large_hat))) 
  rowNames = data.frame(Test = c('Shapiro Wilk pvalue', 'Breusch-Pagan pvalue', 'R Squared', 'Adj R Squared', 'Cross Validation', 'Large Hat Values'))

  summary_output = cbind(rowNames, values)
  show(summary_output)
  plots(model)
}

removeInfluential = function(model, data){
  large_lev_index = which(hatvalues(model) > 2 * mean(hatvalues(model)))
  cooks = which(cooks.distance(model) > 4 / length(cooks.distance(model)))
  rStandard = abs(rstandard(model))
  rStandardIndex = which(rStandard > 2)
  index_ = intersect(intersect(large_lev_index, cooks), rStandardIndex)
  newData = data[-index_ , ]
  return(newData)
}

IdentifyInfluential = function(model, data){
  large_lev_index = which(hatvalues(model) > 2 * mean(hatvalues(model)))
  cooks = which(cooks.distance(model) > 4 / length(cooks.distance(model)))
  rStandard = abs(rstandard(model))
  rStandardIndex = which(rStandard > 2)
  index_ = intersect(intersect(large_lev_index, cooks), rStandardIndex)
  return(index_)
}
removeLargeHatValues = function(model, data){
  largeHat = which(hatvalues(model) > 2 * mean(hatvalues(model)))
  newData = data[-largeHat, ]
  return(newData)
}

prediction_accuracy = function(model, data, margin) {
  upper = data + margin
  lower = data - margin
  prediction = predict(model, newdata = data)
  return(sum(prediction < upper & lower < prediction)) / length(prediction)
}

Knee Score Surgeon

<<<<<<< HEAD

Begin by fitting a full additive model as a baseline and evaluating.

full_additive_model = lm(KneeScore_Surgeon ~ ., data = knee)
evaluate(full_additive_model)
##                   Test       Result
## 1  Shapiro Wilk pvalue 2.080142e-08
## 2 Breusch-Pagan pvalue    0.9192514
## 3            R Squared    0.2718411
## 4        Adj R Squared    0.2130606
## 5     Cross Validation      14.4126
## 6     Large Hat Values          177

The baseline model has many large leverage values, which, upon inspection may be coming from several of the implant fields. Perhaps removing the suspect implant predictors will help improve the model.

model_eight = lm(KneeScore_Surgeon ~ . - Diagnosis - Race - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - TibialTraySize - FemoralComponentType, data = knee)
evaluate(model_eight)
##                   Test       Result
## 1  Shapiro Wilk pvalue 3.989833e-07
## 2 Breusch-Pagan pvalue    0.1270351
## 3            R Squared    0.2374036
## 4        Adj R Squared    0.2084443
## 5     Cross Validation     14.34844
## 6     Large Hat Values          126

anova(model_eight, full_additive_model)
## Analysis of Variance Table
## 
## Model 1: KneeScore_Surgeon ~ (Surgeon + Year + Age + Gender + Weight + 
##     Height + BMI + Diagnosis + Race + Side + KneeScore_Patient + 
##     GPS + Manufacturer + FemoralComponentModel + FemoralComponentSize + 
##     FemoralComponentType + TibialTrayModel + TibialTraySize + 
##     TibialInsertModel + TibialInsertWidth + TibialInsertType + 
##     PatellaModel + PatellaDiameter) - Diagnosis - Race - TibialInsertType - 
##     FemoralComponentModel - PatellaModel - TibialTrayModel - 
##     TibialTraySize - FemoralComponentType
## Model 2: KneeScore_Surgeon ~ Surgeon + Year + Age + Gender + Weight + 
##     Height + BMI + Diagnosis + Race + Side + KneeScore_Patient + 
##     GPS + Manufacturer + FemoralComponentModel + FemoralComponentSize + 
##     FemoralComponentType + TibialTrayModel + TibialTraySize + 
##     TibialInsertModel + TibialInsertWidth + TibialInsertType + 
##     PatellaModel + PatellaDiameter
##   Res.Df    RSS Df Sum of Sq      F Pr(>F)
## 1   1896 379610                           
## 2   1821 362468 75     17142 1.1483 0.1845

While removing the predictors doesn’t yeild a definatively better model at a reasonable \(\alpha\), it does increase Adjusted \(R^2\) and slightly improves the cross validation score.

Utilize a Bayesian Information Criterion approach and evaluate the selected model.

n = length(resid(full_additive_model))
model_back_bic = step(full_additive_model, direction = "backward", k = log(n), trace = 0)
evaluate(model_back_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.169168e-05
## 2 Breusch-Pagan pvalue 0.0001309477
## 3            R Squared    0.1869122
## 4        Adj R Squared    0.1852562
## 5     Cross Validation     14.37628
## 6     Large Hat Values           87
anova(model_back_bic, full_additive_model)
## Analysis of Variance Table
## 
## Model 1: KneeScore_Surgeon ~ Age + Side + KneeScore_Patient
=======
knee_model = lm(KneeScore_Surgeon ~ ., data = knee)
summary(knee_model)
## 
## Call:
## lm(formula = KneeScore_Surgeon ~ ., data = knee)
## 
## Residuals:
##     Min      1Q  Median      3Q     Max 
## -43.928  -8.832  -0.234   7.798  49.165 
## 
## Coefficients: (21 not defined because of singularities)
##                                                             Estimate
## (Intercept)                                                 35.69897
## Surgeon2                                                    -0.45208
## Surgeon3                                                    -1.09966
## Surgeon4                                                    -2.11622
## Surgeon5                                                    -3.71883
## Year2011                                                     1.77031
## Year2012                                                     2.02319
## Year2013                                                    -0.52894
## Year2014                                                     3.04851
## Year2015                                                     2.75161
## Year2016                                                     4.20897
## Age                                                          0.28554
## GenderMale                                                   1.06471
## Weight                                                       0.05721
## Height                                                      -0.21788
## BMI                                                         -0.38019
## DiagnosisInflammatory  non-infection                        -9.50692
## DiagnosisInflammatory non-infection                        -13.35023
## DiagnosisOsteoarthritis                                      0.77431
## DiagnosisPaget's disease                                    18.53356
## DiagnosisPost-trauma                                        -4.33767
## DiagnosisPsoriatic Arthritis                                -8.02204
## DiagnosisRheumatoid arthritis                                7.64676
## DiagnosisRheumatoid arthritis, Inflammatory  non-infection   0.39754
## RaceBlack or African American                               -4.68131
## RaceHispanic                                                -8.44185
## RaceIndian Sub-Continent                                   -19.33729
## RaceMid-East Arabian                                        -2.16159
## RaceMultiple Races                                          -2.87781
## RaceOther                                                   -9.18859
## RacePacific Islander                                       -18.94367
## RaceUnknown                                                 -7.09720
## RaceWhite                                                   -7.75157
## SideLeft                                                     9.63617
## SideRight                                                    9.49269
## KneeScore_Patient                                            0.25771
## GPSYes                                                       0.32214
## ManufacturerSmith & Nephew                                  -3.85441
## ManufacturerStryker                                         -7.10653
## ManufacturerZimmer                                           5.48892
## FemoralComponentModelGenesis II                             -3.62682
## FemoralComponentModelJourney                                26.67707
## FemoralComponentModelLegion                                  1.03307
## FemoralComponentModelNatural Knee II                        -1.46555
## FemoralComponentModelNexgen                                 37.16568
## FemoralComponentModelNexgen Legacy                          24.09579
## FemoralComponentModelNon-porous                             -0.44141
## FemoralComponentModelNonporous                             -21.34760
## FemoralComponentModelPersona                                -9.23130
## FemoralComponentModelPFC Sigma                             -12.85311
## FemoralComponentModelSigma                                 -13.79375
## FemoralComponentModelSigma TC3                             -39.89483
## FemoralComponentModelTriathlon                               4.45165
## FemoralComponentModelTriathlon Total Stabilizer            -15.94065
## FemoralComponentModelZimmer Precoat                         27.95585
## FemoralComponentSize10                                       4.83795
## FemoralComponentSize12                                     -12.15381
## FemoralComponentSize2                                        2.25157
## FemoralComponentSize2.5                                      2.33940
## FemoralComponentSize3                                        4.99422
## FemoralComponentSize3N                                      14.71634
## FemoralComponentSize4                                        4.10799
## FemoralComponentSize4N                                       6.15288
## FemoralComponentSize5                                        3.40948
## FemoralComponentSize5N                                       4.87539
## FemoralComponentSize6                                        1.73559
## FemoralComponentSize6n                                       8.34940
## FemoralComponentSize6N                                       4.61078
## FemoralComponentSize7                                       -0.64904
## FemoralComponentSize7+                                      -1.16563
## FemoralComponentSize73                                       0.28220
## FemoralComponentSize8                                        1.13364
## FemoralComponentSize8+                                      -3.39821
## FemoralComponentSize9                                       -4.71932
## FemoralComponentSizeD                                      -11.53379
## FemoralComponentSizeE                                        0.51213
## FemoralComponentSizeE -                                           NA
## FemoralComponentSizeE-                                     -41.42820
## FemoralComponentSizeF                                        0.36934
## FemoralComponentSizeG                                      -29.00815
## FemoralComponentSizeH                                      -22.80284
## FemoralComponentTypecruciate retaining                      -4.18209
## FemoralComponentTypeposterior stabilized                    -5.41915
## TibialTrayModelGenesis II                                   -0.26612
## TibialTrayModelJourney                                            NA
## TibialTrayModelLegion                                             NA
## TibialTrayModelM.B.T DePuy                                  21.46956
## TibialTrayModelM.B.T. Revision                              25.70339
## TibialTrayModelMBT Keel                                     16.37305
## TibialTrayModelModular                                      -7.40977
## TibialTrayModelNatural-Knee II                                    NA
## TibialTrayModelNexgen                                      -23.12509
## TibialTrayModelNexgen Legacy                                      NA
## TibialTrayModelPersona                                            NA
## TibialTrayModelPFC Sigma                                   -12.04015
## TibialTrayModelPFC Sigma,Modular                                  NA
## TibialTrayModelTriathlon                                          NA
## TibialTraySize10                                           -25.22325
## TibialTraySize2                                             -6.33588
## TibialTraySize2.5                                           -4.22621
## TibialTraySize3                                             -6.40416
## TibialTraySize3/4/2016 0:00                                -14.20771
## TibialTraySize32                                            23.28200
## TibialTraySize4                                             -6.01119
## TibialTraySize5                                             -5.84794
## TibialTraySize5/6/2016 0:00                                  7.54123
## TibialTraySize6                                             -5.81389
## TibialTraySize7                                             -5.61272
## TibialTraySize8                                             -6.20706
## TibialTraySize9                                            -20.39510
## TibialTraySizeD                                            -26.29187
## TibialTraySizeF                                            -22.57134
## TibialTraySizeH                                                   NA
## TibialInsertModelAttune                                           NA
## TibialInsertModelDurasul                                   -22.37945
## TibialInsertModelGenesis II                                 -1.77872
## TibialInsertModelGenesis II PS                              -8.34625
## TibialInsertModelJourney                                          NA
## TibialInsertModelLegion                                     -1.86924
## TibialInsertModelLegion CR XLPE                              0.99616
## TibialInsertModelLegion XLPE                                      NA
## TibialInsertModelNatural-Knee II                                  NA
## TibialInsertModelNexgen                                     -4.06310
## TibialInsertModelNexgen Legacy                             -24.83492
## TibialInsertModelPersona                                          NA
## TibialInsertModelPFC Sigma                                  28.53028
## TibialInsertModelPFC Sigma Cross-Linked                     30.20885
## TibialInsertModelPFC Sigma,Modular                                NA
## TibialInsertModelProlong                                          NA
## TibialInsertModelRotating Platform                                NA
## TibialInsertModelScorpio                                   -15.15883
## TibialInsertModelSigma                                      18.16374
## TibialInsertModelTriathlon                                  -0.21398
## TibialInsertModelTriathlon X3                                3.96823
## TibialInsertModelTriathlon X3 Total Stabilizer                    NA
## TibialInsertWidth                                           -0.07218
## TibialInsertTypeCS (Condylar Stabilizing)                    3.63316
## TibialInsertTypecurved                                       3.31718
## TibialInsertTypecurved,Fixed bearing                         2.40896
## TibialInsertTypedished                                       0.71376
## TibialInsertTypeFixed bearing                                5.32072
## TibialInsertTypeFixed bearing,curved                         1.73647
## TibialInsertTypeFixed bearing,posterior stabilized          -5.91112
## TibialInsertTypehigh flex                                    2.81563
## TibialInsertTypehigh flex,curved                           -15.41864
## TibialInsertTypehigh flex,Fixed bearing                     -1.96367
## TibialInsertTypelipped,Fixed bearing                        10.80613
## TibialInsertTypemobile bearing                               2.98318
## TibialInsertTypemobile bearing,posterior stabilized         -9.80464
## TibialInsertTypeposterior stabilized                         4.52890
## TibialInsertTypeposterior stabilized,curved                 12.18708
## TibialInsertTypeposterior stabilized,curved,Fixed bearing    3.28908
## TibialInsertTypeposterior stabilized,Fixed bearing           0.05072
## TibialInsertTypestandard                                     6.39496
## TibialInsertTypestandard,Fixed bearing                       1.27742
## TibialInsertTypeTS (Total Stabilizer)                        1.13084
## TibialInsertTypeTS (Total Stabilizer),Fixed bearing         -9.14234
## PatellaModelAttune                                                NA
## PatellaModelGenesis II                                       5.01862
## PatellaModelNatural-Knee II                                       NA
## PatellaModelNexGen                                           4.21824
## PatellaModelNexGen Legacy                                   -2.87566
## PatellaModelOval Dome Patella 3-peg                         -3.80728
## PatellaModelPatella Round Dome                              -6.95292
## PatellaModelPersona                                               NA
## PatellaModelPFC Sigma                                       -0.71954
## PatellaModelTriathlon                                       -2.19911
## PatellaModelTriathlon Asymmetric Patella                     0.05584
## PatellaDiameter                                              0.05812
##                                                            Std. Error
## (Intercept)                                                  32.38422
## Surgeon2                                                      1.94491
## Surgeon3                                                      1.84162
## Surgeon4                                                      1.74183
## Surgeon5                                                      1.97887
## Year2011                                                      1.39996
## Year2012                                                      1.38108
## Year2013                                                      1.39028
## Year2014                                                      1.59422
## Year2015                                                      1.80446
## Year2016                                                      2.93910
## Age                                                           0.04184
## GenderMale                                                    1.18194
## Weight                                                        0.07240
## Height                                                        0.44334
## BMI                                                           0.45608
## DiagnosisInflammatory  non-infection                          9.13640
## DiagnosisInflammatory non-infection                          10.21111
## DiagnosisOsteoarthritis                                       5.50850
## DiagnosisPaget's disease                                     15.40286
## DiagnosisPost-trauma                                          6.11385
## DiagnosisPsoriatic Arthritis                                 15.34624
## DiagnosisRheumatoid arthritis                                 6.41512
## DiagnosisRheumatoid arthritis, Inflammatory  non-infection    9.94864
## RaceBlack or African American                                 4.42316
## RaceHispanic                                                  3.81954
## RaceIndian Sub-Continent                                     10.71280
## RaceMid-East Arabian                                         10.59663
## RaceMultiple Races                                           10.64078
## RaceOther                                                     4.62764
## RacePacific Islander                                         14.60124
## RaceUnknown                                                   4.95087
## RaceWhite                                                     3.28429
## SideLeft                                                      0.87505
## SideRight                                                     0.85947
## KneeScore_Patient                                             0.01891
## GPSYes                                                        1.03986
## ManufacturerSmith & Nephew                                    8.37257
## ManufacturerStryker                                           7.39101
## ManufacturerZimmer                                           10.94542
## FemoralComponentModelGenesis II                              14.30536
## FemoralComponentModelJourney                                 18.59560
## FemoralComponentModelLegion                                  13.63817
## FemoralComponentModelNatural Knee II                         21.05376
## FemoralComponentModelNexgen                                  27.74342
## FemoralComponentModelNexgen Legacy                           28.32308
## FemoralComponentModelNon-porous                              23.67199
## FemoralComponentModelNonporous                               23.28013
## FemoralComponentModelPersona                                 10.30303
## FemoralComponentModelPFC Sigma                               19.73980
## FemoralComponentModelSigma                                   18.30815
## FemoralComponentModelSigma TC3                               23.50602
## FemoralComponentModelTriathlon                               11.80054
## FemoralComponentModelTriathlon Total Stabilizer              14.39678
## FemoralComponentModelZimmer Precoat                          27.52343
## FemoralComponentSize10                                       14.93298
## FemoralComponentSize12                                       16.77194
## FemoralComponentSize2                                         8.49888
## FemoralComponentSize2.5                                       9.02211
## FemoralComponentSize3                                         8.33040
## FemoralComponentSize3N                                       11.71841
## FemoralComponentSize4                                         8.36042
## FemoralComponentSize4N                                        8.53076
## FemoralComponentSize5                                         8.42169
## FemoralComponentSize5N                                        8.63502
## FemoralComponentSize6                                         8.52588
## FemoralComponentSize6n                                       13.27649
## FemoralComponentSize6N                                        8.84450
## FemoralComponentSize7                                         8.67281
## FemoralComponentSize7+                                       16.66054
## FemoralComponentSize73                                       16.71433
## FemoralComponentSize8                                         8.96390
## FemoralComponentSize8+                                       16.70852
## FemoralComponentSize9                                        11.03906
## FemoralComponentSizeD                                        16.13281
## FemoralComponentSizeE                                        15.05931
## FemoralComponentSizeE -                                            NA
## FemoralComponentSizeE-                                       20.99607
## FemoralComponentSizeF                                        16.01777
## FemoralComponentSizeG                                        19.46720
## FemoralComponentSizeH                                        28.87886
## FemoralComponentTypecruciate retaining                        8.81170
## FemoralComponentTypeposterior stabilized                      8.81335
## TibialTrayModelGenesis II                                     5.19401
## TibialTrayModelJourney                                             NA
## TibialTrayModelLegion                                              NA
## TibialTrayModelM.B.T DePuy                                   18.80335
## TibialTrayModelM.B.T. Revision                               21.91736
## TibialTrayModelMBT Keel                                      18.18146
## TibialTrayModelModular                                       27.21795
## TibialTrayModelNatural-Knee II                                     NA
## TibialTrayModelNexgen                                        31.68197
## TibialTrayModelNexgen Legacy                                       NA
## TibialTrayModelPersona                                             NA
## TibialTrayModelPFC Sigma                                     22.91731
## TibialTrayModelPFC Sigma,Modular                                   NA
## TibialTrayModelTriathlon                                           NA
## TibialTraySize10                                             15.52482
## TibialTraySize2                                              10.51634
## TibialTraySize2.5                                            10.86921
## TibialTraySize3                                              10.58328
## TibialTraySize3/4/2016 0:00                                  13.50168
## TibialTraySize32                                             14.98073
## TibialTraySize4                                              10.60363
## TibialTraySize5                                              10.68184
## TibialTraySize5/6/2016 0:00                                  17.81406
## TibialTraySize6                                              10.79266
## TibialTraySize7                                              10.91583
## TibialTraySize8                                              11.19861
## TibialTraySize9                                              13.37900
## TibialTraySizeD                                              18.23804
## TibialTraySizeF                                              18.39743
## TibialTraySizeH                                                    NA
## TibialInsertModelAttune                                            NA
## TibialInsertModelDurasul                                     22.01654
## TibialInsertModelGenesis II                                   5.03691
## TibialInsertModelGenesis II PS                                7.03771
## TibialInsertModelJourney                                           NA
## TibialInsertModelLegion                                       5.07040
## TibialInsertModelLegion CR XLPE                               5.21939
## TibialInsertModelLegion XLPE                                       NA
## TibialInsertModelNatural-Knee II                                   NA
## TibialInsertModelNexgen                                      12.55582
## TibialInsertModelNexgen Legacy                               16.41055
## TibialInsertModelPersona                                           NA
## TibialInsertModelPFC Sigma                                   14.47970
## TibialInsertModelPFC Sigma Cross-Linked                      14.84704
## TibialInsertModelPFC Sigma,Modular                                 NA
## TibialInsertModelProlong                                           NA
## TibialInsertModelRotating Platform                                 NA
## TibialInsertModelScorpio                                     15.23609
## TibialInsertModelSigma                                       16.04480
## TibialInsertModelTriathlon                                    4.82877
## TibialInsertModelTriathlon X3                                 4.96113
## TibialInsertModelTriathlon X3 Total Stabilizer                     NA
## TibialInsertWidth                                             0.20280
## TibialInsertTypeCS (Condylar Stabilizing)                     7.52301
## TibialInsertTypecurved                                        8.12192
## TibialInsertTypecurved,Fixed bearing                          7.89182
## TibialInsertTypedished                                        8.00652
## TibialInsertTypeFixed bearing                                 7.85434
## TibialInsertTypeFixed bearing,curved                         13.13132
## TibialInsertTypeFixed bearing,posterior stabilized           13.56780
## TibialInsertTypehigh flex                                     7.34540
## TibialInsertTypehigh flex,curved                             12.63351
## TibialInsertTypehigh flex,Fixed bearing                      17.30055
## TibialInsertTypelipped,Fixed bearing                         16.92708
## TibialInsertTypemobile bearing                                9.08744
## TibialInsertTypemobile bearing,posterior stabilized          14.31335
## TibialInsertTypeposterior stabilized                          7.53633
## TibialInsertTypeposterior stabilized,curved                  16.24873
## TibialInsertTypeposterior stabilized,curved,Fixed bearing    16.93149
## TibialInsertTypeposterior stabilized,Fixed bearing            8.52750
## TibialInsertTypestandard                                      7.48079
## TibialInsertTypestandard,Fixed bearing                       17.83097
## TibialInsertTypeTS (Total Stabilizer)                         8.46987
## TibialInsertTypeTS (Total Stabilizer),Fixed bearing          16.23051
## PatellaModelAttune                                                 NA
## PatellaModelGenesis II                                        7.53765
## PatellaModelNatural-Knee II                                        NA
## PatellaModelNexGen                                           32.17471
## PatellaModelNexGen Legacy                                    11.24018
## PatellaModelOval Dome Patella 3-peg                           3.54140
## PatellaModelPatella Round Dome                                4.56766
## PatellaModelPersona                                                NA
## PatellaModelPFC Sigma                                        15.84998
## PatellaModelTriathlon                                         7.95086
## PatellaModelTriathlon Asymmetric Patella                      6.93708
## PatellaDiameter                                               0.06319
##                                                            t value
## (Intercept)                                                  1.102
## Surgeon2                                                    -0.232
## Surgeon3                                                    -0.597
## Surgeon4                                                    -1.215
## Surgeon5                                                    -1.879
## Year2011                                                     1.265
## Year2012                                                     1.465
## Year2013                                                    -0.380
## Year2014                                                     1.912
## Year2015                                                     1.525
## Year2016                                                     1.432
## Age                                                          6.825
## GenderMale                                                   0.901
## Weight                                                       0.790
## Height                                                      -0.491
## BMI                                                         -0.834
## DiagnosisInflammatory  non-infection                        -1.041
## DiagnosisInflammatory non-infection                         -1.307
## DiagnosisOsteoarthritis                                      0.141
## DiagnosisPaget's disease                                     1.203
## DiagnosisPost-trauma                                        -0.709
## DiagnosisPsoriatic Arthritis                                -0.523
## DiagnosisRheumatoid arthritis                                1.192
## DiagnosisRheumatoid arthritis, Inflammatory  non-infection   0.040
## RaceBlack or African American                               -1.058
## RaceHispanic                                                -2.210
## RaceIndian Sub-Continent                                    -1.805
## RaceMid-East Arabian                                        -0.204
## RaceMultiple Races                                          -0.270
## RaceOther                                                   -1.986
## RacePacific Islander                                        -1.297
## RaceUnknown                                                 -1.434
## RaceWhite                                                   -2.360
## SideLeft                                                    11.012
## SideRight                                                   11.045
## KneeScore_Patient                                           13.627
## GPSYes                                                       0.310
## ManufacturerSmith & Nephew                                  -0.460
## ManufacturerStryker                                         -0.962
## ManufacturerZimmer                                           0.501
## FemoralComponentModelGenesis II                             -0.254
## FemoralComponentModelJourney                                 1.435
## FemoralComponentModelLegion                                  0.076
## FemoralComponentModelNatural Knee II                        -0.070
## FemoralComponentModelNexgen                                  1.340
## FemoralComponentModelNexgen Legacy                           0.851
## FemoralComponentModelNon-porous                             -0.019
## FemoralComponentModelNonporous                              -0.917
## FemoralComponentModelPersona                                -0.896
## FemoralComponentModelPFC Sigma                              -0.651
## FemoralComponentModelSigma                                  -0.753
## FemoralComponentModelSigma TC3                              -1.697
## FemoralComponentModelTriathlon                               0.377
## FemoralComponentModelTriathlon Total Stabilizer             -1.107
## FemoralComponentModelZimmer Precoat                          1.016
## FemoralComponentSize10                                       0.324
## FemoralComponentSize12                                      -0.725
## FemoralComponentSize2                                        0.265
## FemoralComponentSize2.5                                      0.259
## FemoralComponentSize3                                        0.600
## FemoralComponentSize3N                                       1.256
## FemoralComponentSize4                                        0.491
## FemoralComponentSize4N                                       0.721
## FemoralComponentSize5                                        0.405
## FemoralComponentSize5N                                       0.565
## FemoralComponentSize6                                        0.204
## FemoralComponentSize6n                                       0.629
## FemoralComponentSize6N                                       0.521
## FemoralComponentSize7                                       -0.075
## FemoralComponentSize7+                                      -0.070
## FemoralComponentSize73                                       0.017
## FemoralComponentSize8                                        0.126
## FemoralComponentSize8+                                      -0.203
## FemoralComponentSize9                                       -0.428
## FemoralComponentSizeD                                       -0.715
## FemoralComponentSizeE                                        0.034
## FemoralComponentSizeE -                                         NA
## FemoralComponentSizeE-                                      -1.973
## FemoralComponentSizeF                                        0.023
## FemoralComponentSizeG                                       -1.490
## FemoralComponentSizeH                                       -0.790
## FemoralComponentTypecruciate retaining                      -0.475
## FemoralComponentTypeposterior stabilized                    -0.615
## TibialTrayModelGenesis II                                   -0.051
## TibialTrayModelJourney                                          NA
## TibialTrayModelLegion                                           NA
## TibialTrayModelM.B.T DePuy                                   1.142
## TibialTrayModelM.B.T. Revision                               1.173
## TibialTrayModelMBT Keel                                      0.901
## TibialTrayModelModular                                      -0.272
## TibialTrayModelNatural-Knee II                                  NA
## TibialTrayModelNexgen                                       -0.730
## TibialTrayModelNexgen Legacy                                    NA
## TibialTrayModelPersona                                          NA
## TibialTrayModelPFC Sigma                                    -0.525
## TibialTrayModelPFC Sigma,Modular                                NA
## TibialTrayModelTriathlon                                        NA
## TibialTraySize10                                            -1.625
## TibialTraySize2                                             -0.602
## TibialTraySize2.5                                           -0.389
## TibialTraySize3                                             -0.605
## TibialTraySize3/4/2016 0:00                                 -1.052
## TibialTraySize32                                             1.554
## TibialTraySize4                                             -0.567
## TibialTraySize5                                             -0.547
## TibialTraySize5/6/2016 0:00                                  0.423
## TibialTraySize6                                             -0.539
## TibialTraySize7                                             -0.514
## TibialTraySize8                                             -0.554
## TibialTraySize9                                             -1.524
## TibialTraySizeD                                             -1.442
## TibialTraySizeF                                             -1.227
## TibialTraySizeH                                                 NA
## TibialInsertModelAttune                                         NA
## TibialInsertModelDurasul                                    -1.016
## TibialInsertModelGenesis II                                 -0.353
## TibialInsertModelGenesis II PS                              -1.186
## TibialInsertModelJourney                                        NA
## TibialInsertModelLegion                                     -0.369
## TibialInsertModelLegion CR XLPE                              0.191
## TibialInsertModelLegion XLPE                                    NA
## TibialInsertModelNatural-Knee II                                NA
## TibialInsertModelNexgen                                     -0.324
## TibialInsertModelNexgen Legacy                              -1.513
## TibialInsertModelPersona                                        NA
## TibialInsertModelPFC Sigma                                   1.970
## TibialInsertModelPFC Sigma Cross-Linked                      2.035
## TibialInsertModelPFC Sigma,Modular                              NA
## TibialInsertModelProlong                                        NA
## TibialInsertModelRotating Platform                              NA
## TibialInsertModelScorpio                                    -0.995
## TibialInsertModelSigma                                       1.132
## TibialInsertModelTriathlon                                  -0.044
## TibialInsertModelTriathlon X3                                0.800
## TibialInsertModelTriathlon X3 Total Stabilizer                  NA
## TibialInsertWidth                                           -0.356
## TibialInsertTypeCS (Condylar Stabilizing)                    0.483
## TibialInsertTypecurved                                       0.408
## TibialInsertTypecurved,Fixed bearing                         0.305
## TibialInsertTypedished                                       0.089
## TibialInsertTypeFixed bearing                                0.677
## TibialInsertTypeFixed bearing,curved                         0.132
## TibialInsertTypeFixed bearing,posterior stabilized          -0.436
## TibialInsertTypehigh flex                                    0.383
## TibialInsertTypehigh flex,curved                            -1.220
## TibialInsertTypehigh flex,Fixed bearing                     -0.114
## TibialInsertTypelipped,Fixed bearing                         0.638
## TibialInsertTypemobile bearing                               0.328
## TibialInsertTypemobile bearing,posterior stabilized         -0.685
## TibialInsertTypeposterior stabilized                         0.601
## TibialInsertTypeposterior stabilized,curved                  0.750
## TibialInsertTypeposterior stabilized,curved,Fixed bearing    0.194
## TibialInsertTypeposterior stabilized,Fixed bearing           0.006
## TibialInsertTypestandard                                     0.855
## TibialInsertTypestandard,Fixed bearing                       0.072
## TibialInsertTypeTS (Total Stabilizer)                        0.134
## TibialInsertTypeTS (Total Stabilizer),Fixed bearing         -0.563
## PatellaModelAttune                                              NA
## PatellaModelGenesis II                                       0.666
## PatellaModelNatural-Knee II                                     NA
## PatellaModelNexGen                                           0.131
## PatellaModelNexGen Legacy                                   -0.256
## PatellaModelOval Dome Patella 3-peg                         -1.075
## PatellaModelPatella Round Dome                              -1.522
## PatellaModelPersona                                             NA
## PatellaModelPFC Sigma                                       -0.045
## PatellaModelTriathlon                                       -0.277
## PatellaModelTriathlon Asymmetric Patella                     0.008
## PatellaDiameter                                              0.920
##                                                            Pr(>|t|)    
## (Intercept)                                                  0.2705    
## Surgeon2                                                     0.8162    
## Surgeon3                                                     0.5505    
## Surgeon4                                                     0.2245    
## Surgeon5                                                     0.0604 .  
## Year2011                                                     0.2062    
## Year2012                                                     0.1431    
## Year2013                                                     0.7037    
## Year2014                                                     0.0560 .  
## Year2015                                                     0.1275    
## Year2016                                                     0.1523    
## Age                                                        1.19e-11 ***
## GenderMale                                                   0.3678    
## Weight                                                       0.4295    
## Height                                                       0.6232    
## BMI                                                          0.4046    
## DiagnosisInflammatory  non-infection                         0.2982    
## DiagnosisInflammatory non-infection                          0.1912    
## DiagnosisOsteoarthritis                                      0.8882    
## DiagnosisPaget's disease                                     0.2290    
## DiagnosisPost-trauma                                         0.4781    
## DiagnosisPsoriatic Arthritis                                 0.6012    
## DiagnosisRheumatoid arthritis                                0.2334    
## DiagnosisRheumatoid arthritis, Inflammatory  non-infection   0.9681    
## RaceBlack or African American                                0.2900    
## RaceHispanic                                                 0.0272 *  
## RaceIndian Sub-Continent                                     0.0712 .  
## RaceMid-East Arabian                                         0.8384    
## RaceMultiple Races                                           0.7868    
## RaceOther                                                    0.0472 *  
## RacePacific Islander                                         0.1947    
## RaceUnknown                                                  0.1519    
## RaceWhite                                                    0.0184 *  
## SideLeft                                                    < 2e-16 ***
## SideRight                                                   < 2e-16 ***
## KneeScore_Patient                                           < 2e-16 ***
## GPSYes                                                       0.7568    
## ManufacturerSmith & Nephew                                   0.6453    
## ManufacturerStryker                                          0.3364    
## ManufacturerZimmer                                           0.6161    
## FemoralComponentModelGenesis II                              0.7999    
## FemoralComponentModelJourney                                 0.1516    
## FemoralComponentModelLegion                                  0.9396    
## FemoralComponentModelNatural Knee II                         0.9445    
## FemoralComponentModelNexgen                                  0.1805    
## FemoralComponentModelNexgen Legacy                           0.3950    
## FemoralComponentModelNon-porous                              0.9851    
## FemoralComponentModelNonporous                               0.3593    
## FemoralComponentModelPersona                                 0.3704    
## FemoralComponentModelPFC Sigma                               0.5150    
## FemoralComponentModelSigma                                   0.4513    
## FemoralComponentModelSigma TC3                               0.0898 .  
## FemoralComponentModelTriathlon                               0.7060    
## FemoralComponentModelTriathlon Total Stabilizer              0.2683    
## FemoralComponentModelZimmer Precoat                          0.3099    
## FemoralComponentSize10                                       0.7460    
## FemoralComponentSize12                                       0.4688    
## FemoralComponentSize2                                        0.7911    
## FemoralComponentSize2.5                                      0.7954    
## FemoralComponentSize3                                        0.5489    
## FemoralComponentSize3N                                       0.2093    
## FemoralComponentSize4                                        0.6232    
## FemoralComponentSize4N                                       0.4708    
## FemoralComponentSize5                                        0.6856    
## FemoralComponentSize5N                                       0.5724    
## FemoralComponentSize6                                        0.8387    
## FemoralComponentSize6n                                       0.5295    
## FemoralComponentSize6N                                       0.6022    
## FemoralComponentSize7                                        0.9404    
## FemoralComponentSize7+                                       0.9442    
## FemoralComponentSize73                                       0.9865    
## FemoralComponentSize8                                        0.8994    
## FemoralComponentSize8+                                       0.8389    
## FemoralComponentSize9                                        0.6691    
## FemoralComponentSizeD                                        0.4747    
## FemoralComponentSizeE                                        0.9729    
## FemoralComponentSizeE -                                          NA    
## FemoralComponentSizeE-                                       0.0486 *  
## FemoralComponentSizeF                                        0.9816    
## FemoralComponentSizeG                                        0.1364    
## FemoralComponentSizeH                                        0.4299    
## FemoralComponentTypecruciate retaining                       0.6351    
## FemoralComponentTypeposterior stabilized                     0.5387    
## TibialTrayModelGenesis II                                    0.9591    
## TibialTrayModelJourney                                           NA    
## TibialTrayModelLegion                                            NA    
## TibialTrayModelM.B.T DePuy                                   0.2537    
## TibialTrayModelM.B.T. Revision                               0.2411    
## TibialTrayModelMBT Keel                                      0.3680    
## TibialTrayModelModular                                       0.7855    
## TibialTrayModelNatural-Knee II                                   NA    
## TibialTrayModelNexgen                                        0.4655    
## TibialTrayModelNexgen Legacy                                     NA    
## TibialTrayModelPersona                                           NA    
## TibialTrayModelPFC Sigma                                     0.5994    
## TibialTrayModelPFC Sigma,Modular                                 NA    
## TibialTrayModelTriathlon                                         NA    
## TibialTraySize10                                             0.1044    
## TibialTraySize2                                              0.5469    
## TibialTraySize2.5                                            0.6975    
## TibialTraySize3                                              0.5452    
## TibialTraySize3/4/2016 0:00                                  0.2928    
## TibialTraySize32                                             0.1203    
## TibialTraySize4                                              0.5709    
## TibialTraySize5                                              0.5841    
## TibialTraySize5/6/2016 0:00                                  0.6721    
## TibialTraySize6                                              0.5902    
## TibialTraySize7                                              0.6072    
## TibialTraySize8                                              0.5795    
## TibialTraySize9                                              0.1276    
## TibialTraySizeD                                              0.1496    
## TibialTraySizeF                                              0.2200    
## TibialTraySizeH                                                  NA    
## TibialInsertModelAttune                                          NA    
## TibialInsertModelDurasul                                     0.3095    
## TibialInsertModelGenesis II                                  0.7240    
## TibialInsertModelGenesis II PS                               0.2358    
## TibialInsertModelJourney                                         NA    
## TibialInsertModelLegion                                      0.7124    
## TibialInsertModelLegion CR XLPE                              0.8487    
## TibialInsertModelLegion XLPE                                     NA    
## TibialInsertModelNatural-Knee II                                 NA    
## TibialInsertModelNexgen                                      0.7463    
## TibialInsertModelNexgen Legacy                               0.1304    
## TibialInsertModelPersona                                         NA    
## TibialInsertModelPFC Sigma                                   0.0489 *  
## TibialInsertModelPFC Sigma Cross-Linked                      0.0420 *  
## TibialInsertModelPFC Sigma,Modular                               NA    
## TibialInsertModelProlong                                         NA    
## TibialInsertModelRotating Platform                               NA    
## TibialInsertModelScorpio                                     0.3199    
## TibialInsertModelSigma                                       0.2578    
## TibialInsertModelTriathlon                                   0.9647    
## TibialInsertModelTriathlon X3                                0.4239    
## TibialInsertModelTriathlon X3 Total Stabilizer                   NA    
## TibialInsertWidth                                            0.7219    
## TibialInsertTypeCS (Condylar Stabilizing)                    0.6292    
## TibialInsertTypecurved                                       0.6830    
## TibialInsertTypecurved,Fixed bearing                         0.7602    
## TibialInsertTypedished                                       0.9290    
## TibialInsertTypeFixed bearing                                0.4982    
## TibialInsertTypeFixed bearing,curved                         0.8948    
## TibialInsertTypeFixed bearing,posterior stabilized           0.6631    
## TibialInsertTypehigh flex                                    0.7015    
## TibialInsertTypehigh flex,curved                             0.2225    
## TibialInsertTypehigh flex,Fixed bearing                      0.9096    
## TibialInsertTypelipped,Fixed bearing                         0.5233    
## TibialInsertTypemobile bearing                               0.7427    
## TibialInsertTypemobile bearing,posterior stabilized          0.4934    
## TibialInsertTypeposterior stabilized                         0.5480    
## TibialInsertTypeposterior stabilized,curved                  0.4533    
## TibialInsertTypeposterior stabilized,curved,Fixed bearing    0.8460    
## TibialInsertTypeposterior stabilized,Fixed bearing           0.9953    
## TibialInsertTypestandard                                     0.3927    
## TibialInsertTypestandard,Fixed bearing                       0.9429    
## TibialInsertTypeTS (Total Stabilizer)                        0.8938    
## TibialInsertTypeTS (Total Stabilizer),Fixed bearing          0.5733    
## PatellaModelAttune                                               NA    
## PatellaModelGenesis II                                       0.5056    
## PatellaModelNatural-Knee II                                      NA    
## PatellaModelNexGen                                           0.8957    
## PatellaModelNexGen Legacy                                    0.7981    
## PatellaModelOval Dome Patella 3-peg                          0.2825    
## PatellaModelPatella Round Dome                               0.1281    
## PatellaModelPersona                                              NA    
## PatellaModelPFC Sigma                                        0.9638    
## PatellaModelTriathlon                                        0.7821    
## PatellaModelTriathlon Asymmetric Patella                     0.9936    
## PatellaDiameter                                              0.3578    
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
## 
## Residual standard error: 14.11 on 1821 degrees of freedom
## Multiple R-squared:  0.2718, Adjusted R-squared:  0.2131 
## F-statistic: 4.625 on 147 and 1821 DF,  p-value: < 2.2e-16
knee_model_aic = step(knee_model, direction = "backward", trace = 0)
summary(knee_model_aic)$call
## lm(formula = KneeScore_Surgeon ~ Surgeon + Year + Age + Side + 
##     KneeScore_Patient, data = knee)
knee_model_bic = step(knee_model, direction = "backward", trace = 0)
summary(knee_model_bic)$call
## lm(formula = KneeScore_Surgeon ~ Surgeon + Year + Age + Side + 
##     KneeScore_Patient, data = knee)
anova(knee_model_aic, knee_model)
## Analysis of Variance Table
## 
## Model 1: KneeScore_Surgeon ~ Surgeon + Year + Age + Side + KneeScore_Patient
>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
## Model 2: KneeScore_Surgeon ~ Surgeon + Year + Age + Gender + Weight + 
##     Height + BMI + Diagnosis + Race + Side + KneeScore_Patient + 
##     GPS + Manufacturer + FemoralComponentModel + FemoralComponentSize + 
##     FemoralComponentType + TibialTrayModel + TibialTraySize + 
##     TibialInsertModel + TibialInsertWidth + TibialInsertType + 
##     PatellaModel + PatellaDiameter
<<<<<<< HEAD
##   Res.Df    RSS  Df Sum of Sq      F   Pr(>F)    
## 1   1964 404744                                  
## 2   1821 362468 143     42276 1.4853 0.000288 ***
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

The resulting BIC model scores marginally better in Cross Validation though the anova results clearly prefer the full additive model.

Next, we will evaluate the RMSE of the best performing models thus far.

options(warn=-1)
sample_idx = sample(1:nrow(knee), 1500)
train_data = knee[sample_idx, ]
test_data = knee[-sample_idx, ]

rmse  = function(actual, predicted) {  sqrt(mean((actual - predicted) ^ 2))}

full_test = rmse(test_data$KneeScore_Surgeon, predict(full_additive_model, test_data))
full_train = rmse(train_data$KneeScore_Surgeon, predict(full_additive_model, train_data))

test_mod_8 = rmse(test_data$KneeScore_Surgeon, predict(model_eight, test_data))
train_mod_8 = rmse(train_data$KneeScore_Surgeon, predict(model_eight, train_data))

bic_test = rmse(test_data$KneeScore_Surgeon, predict(model_back_bic, test_data))
bic_train = rmse(train_data$KneeScore_Surgeon, predict(model_back_bic, train_data))
Model Train RMSE Test RMSE
Model All+ 13.5971116 13.4739255
Model 9 13.9817512 13.5709474
Model BIC 14.4076613 14.1099068

** how are these evalueated again?**

Perhaps removing some influential data points will assist in creating a better model. Next we remove influential points that satisfy the criteria for Large Leverage, Large Cook's Distance and Large RStandard values.

new_dataframe = removeInfluential(model_eight, knee)
model_nine = lm(KneeScore_Surgeon ~ . - Diagnosis - Race - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - TibialTraySize - FemoralComponentType, data = new_dataframe)
evaluate(model_nine)
##                   Test       Result
## 1  Shapiro Wilk pvalue 6.718107e-07
## 2 Breusch-Pagan pvalue    0.2390135
## 3            R Squared     0.241005
## 4        Adj R Squared    0.2124774
## 5     Cross Validation     14.17558
## 6     Large Hat Values          119

This does seem to be of moderate benefit in relation to Adjusted \(R^2\) and Cross Validation.

We will evaluate the same model one again after removing the identified large hat values.

remove_hat_dataframe = removeLargeHatValues(model_nine, new_dataframe)
model_ten = lm(KneeScore_Surgeon ~ . - Diagnosis - Race - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - TibialTraySize - FemoralComponentType, data = remove_hat_dataframe)
evaluate(model_ten)
##                   Test       Result
## 1  Shapiro Wilk pvalue 3.604033e-06
## 2 Breusch-Pagan pvalue   0.08198251
## 3            R Squared    0.2187277
## 4        Adj R Squared    0.2013757
## 5     Cross Validation     14.32124
## 6     Large Hat Values           93

Removing the large hat values decreased the Cross Validation score. Perhaps we can find a better model by running BIC on the modified dataframe.

n = length(resid(model_ten))
model_10_bic = step(model_ten, direction = "backward", k = log(n), trace = 0)
evaluate(model_10_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 0.0001065612
## 2 Breusch-Pagan pvalue 8.666681e-05
## 3            R Squared    0.1862472
## 4        Adj R Squared    0.1844753
## 5     Cross Validation     14.33378
## 6     Large Hat Values           79

Despite the reduction in leveraged values the selected model is no better than preceeding models.

sample_idx_two = sample(1:nrow(remove_hat_dataframe), 1300)
train_data_two = remove_hat_dataframe[sample_idx_two, ]
test_data_two = remove_hat_dataframe[-sample_idx_two, ]

test_mod_10 = rmse(test_data_two$KneeScore_Surgeon, predict(model_ten, train_data_two))
train_mod_10 = rmse(train_data_two$KneeScore_Surgeon, predict(model_ten, train_data_two))

test_10_bic = rmse(test_data_two$KneeScore_Surgeon, predict(model_10_bic, train_data_two))
train_10_bic = rmse(train_data_two$KneeScore_Surgeon, predict(model_10_bic, train_data_two))
Model Train RMSE Test RMSE
Model 10 13.7957748 17.5150555
Model 10 BIC 14.1082151 17.282833

Appendix

Knee Score Surgeon

model = lm(KneeScore_Surgeon ~. , data = knee)
model_back_aic = step(model, direction = "backward", trace = 0)
model_for_aic = step(model, direction = "forward", trace = 0)
model_both_aic = step(model, direction = "both", trace = 0)

n = length(resid(model))
model_back_bic = step(model, direction = "backward", k = log(n), trace = 0)
model_for_bic = step(model, direction = "forward", k = log(n), trace = 0)
model_both_bic = step(model, direction = "both", k = log(n), trace = 0)
evaluate(model)
##                   Test       Result
## 1  Shapiro Wilk pvalue 2.080142e-08
## 2 Breusch-Pagan pvalue    0.9192514
## 3            R Squared    0.2718411
## 4        Adj R Squared    0.2130606
## 5     Cross Validation      14.4126
## 6     Large Hat Values          177

evaluate(model_back_aic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.360071e-06
## 2 Breusch-Pagan pvalue 0.0002559995
## 3            R Squared    0.2062052
## 4        Adj R Squared    0.2005178
## 5     Cross Validation     14.27567
## 6     Large Hat Values           36

evaluate(model_for_aic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 2.080142e-08
## 2 Breusch-Pagan pvalue    0.9192514
## 3            R Squared    0.2718411
## 4        Adj R Squared    0.2130606
## 5     Cross Validation      14.4126
## 6     Large Hat Values          177

evaluate(model_both_aic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.360071e-06
## 2 Breusch-Pagan pvalue 0.0002559995
## 3            R Squared    0.2062052
## 4        Adj R Squared    0.2005178
## 5     Cross Validation     14.27567
## 6     Large Hat Values           36

## Adj.R.Squared: 0.24 - CrossValidation: 16.9
evaluate(model_back_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.169168e-05
## 2 Breusch-Pagan pvalue 0.0001309477
## 3            R Squared    0.1869122
## 4        Adj R Squared    0.1852562
## 5     Cross Validation     14.37628
## 6     Large Hat Values           87

evaluate(model_for_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 2.080142e-08
## 2 Breusch-Pagan pvalue    0.9192514
## 3            R Squared    0.2718411
## 4        Adj R Squared    0.2130606
## 5     Cross Validation      14.4126
## 6     Large Hat Values          177

evaluate(model_both_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.169168e-05
## 2 Breusch-Pagan pvalue 0.0001309477
## 3            R Squared    0.1869122
## 4        Adj R Squared    0.1852562
## 5     Cross Validation     14.37628
## 6     Large Hat Values           87

## Remove large leverages
new_data = removeInfluential(model, knee)
model_without_large_leverages = lm(KneeScore_Surgeon ~ ., data = new_data)
evaluate(model_without_large_leverages)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.780351e-08
## 2 Breusch-Pagan pvalue     0.976812
## 3            R Squared    0.2753609
## 4        Adj R Squared    0.2167679
## 5     Cross Validation     14.32084
## 6     Large Hat Values          179

Removing the records with large hat values and refitting simply introduces more large hat values. What is unique about these rows of data?

which(hatvalues(model) == 1.0)
##   17   19   74  240  244  258  354  533  638  703  763  844  876  881  891 
##   17   19   74  240  244  258  354  533  638  703  763  844  876  881  891 
## 1047 1093 1096 1245 1374 1406 1544 1554 1686 1775 1778 1781 1919 1927 1938 
## 1047 1093 1096 1245 1374 1406 1544 1554 1686 1775 1778 1781 1919 1927 1938
print(knee[17,])
##    Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race Side
## 17       3 2010  68 Female    148     64 25.43 Osteoarthritis White Left
##    KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 17                48                50  No  DePuy (J&J)
##    FemoralComponentModel FemoralComponentSize FemoralComponentType
## 17             Nonporous                    2   cruciate retaining
##    TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 17       PFC Sigma            2.5         PFC Sigma                 8
##    TibialInsertType      PatellaModel PatellaDiameter
## 17           curved 3-Post Round Dome              32
print(knee[240,])
##     Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race
## 240       3 2010  66 Female    130     68 19.78 Osteoarthritis White
##          Side KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 240 Bilateral                60                80  No       Zimmer
##     FemoralComponentModel FemoralComponentSize FemoralComponentType
## 240        Zimmer Precoat                  E -   cruciate retaining
##     TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 240          Nexgen              4            Nexgen                10
##     TibialInsertType PatellaModel PatellaDiameter
## 240         standard       NexGen              32
print(knee[638,])
##     Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race  Side
## 638       3 2012  67 Female    167     62 30.57 Osteoarthritis White Right
##     KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 638                55                55 Yes  DePuy (J&J)
##     FemoralComponentModel FemoralComponentSize FemoralComponentType
## 638                 Sigma                   4N   cruciate retaining
##     TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 638       PFC Sigma              3         PFC Sigma                 8
##            TibialInsertType                 PatellaModel PatellaDiameter
## 638 high flex,Fixed bearing Triathlon Asymmetric Patella              35
print(knee[1093,])
##      Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race
## 1093       3 2013  83 Female    130     63 23.05 Osteoarthritis White
##       Side KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 1093 Right                66                50  No  DePuy (J&J)
##      FemoralComponentModel FemoralComponentSize FemoralComponentType
## 1093                 Sigma                    3 posterior stabilized
##      TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 1093       PFC Sigma            2.5         PFC Sigma                 8
##                 TibialInsertType            PatellaModel PatellaDiameter
## 1093 posterior stabilized,curved Oval Dome Patella 3-peg              35
print(knee[1775, ])
##      Surgeon Year Age Gender Weight Height   BMI   Diagnosis  Race  Side
## 1775       3 2014  22   Male    240     77 28.49 Post-trauma White Right
##      KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 1775                32                65  No       Zimmer
##      FemoralComponentModel FemoralComponentSize FemoralComponentType
## 1775               Persona                   12   cruciate retaining
##      TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 1775         Persona              H           Persona                10
##      TibialInsertType PatellaModel PatellaDiameter
## 1775    Fixed bearing      Persona              38

Each of these has a suspect factor entry. Perhaps the entry only occurs infrequently in the database and is causing the entire row to be flagged as high leverage? Try pulling the suspect columns out of the LM and see what happens.

model_nine = lm(KneeScore_Surgeon ~ . - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - FemoralComponentType, data = knee)
evaluate(model_nine)
##                   Test       Result
## 1  Shapiro Wilk pvalue 8.105873e-08
## 2 Breusch-Pagan pvalue    0.1679994
## 3            R Squared    0.2556416
## 4        Adj R Squared    0.2141109
## 5     Cross Validation     14.34666
## 6     Large Hat Values          138

======= ## Res.Df RSS Df Sum of Sq F Pr(>F) ## 1 1954 395140 ## 2 1821 362468 133 32673 1.2342 0.04065 * ## --- ## Signif. codes: 0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
plot(fitted(knee_model_aic), resid(knee_model_aic), col = "magenta", pch = 20, cex = 1.5, xlab = "Fitted", ylab = "Residuals")
abline(h=0, lwd = 2, col = "olivedrab1")

qqnorm(resid(knee_model_aic), main = "Normal QQ Plot - Knee Model AIC", col = "purple4", pch = 20, cex = 1.5)
qqline(resid(knee_model_aic), col = "Orange", lwd = 3)

library(lmtest)
bptest(knee_model_aic)
## 
##  studentized Breusch-Pagan test
## 
## data:  knee_model_aic
## BP = 39.99, df = 14, p-value = 0.000256
shapiro.test(resid(knee_model_aic))
## 
##  Shapiro-Wilk normality test
## 
## data:  resid(knee_model_aic)
## W = 0.99459, p-value = 1.36e-06
>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

Knee Score Patient

Begin by fitting a full additive model as a baseline and evaluating.

full_additive_model = lm(KneeScore_Patient ~ ., data = knee)
evaluate(full_additive_model)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.941649e-14
## 2 Breusch-Pagan pvalue    0.9539336
## 3            R Squared    0.3161338
## 4        Adj R Squared    0.2609288
## 5     Cross Validation     17.07365
## 6     Large Hat Values          178
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

The baseline model has many large leverage values, which, upon inspection may be coming from several of the implant fields. Perhaps removing the suspect implant predictors will help improve the model.

model_eight = lm(KneeScore_Patient ~ . - Diagnosis - Race - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - TibialTraySize - FemoralComponentType, data = knee)
evaluate(model_eight)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.115521e-14
## 2 Breusch-Pagan pvalue   0.00453633
## 3            R Squared    0.2810507
## 4        Adj R Squared    0.2537489
## 5     Cross Validation     16.88212
## 6     Large Hat Values          126
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
anova(model_eight, full_additive_model)
## Analysis of Variance Table
## 
## Model 1: KneeScore_Patient ~ (Surgeon + Year + Age + Gender + Weight + 
##     Height + BMI + Diagnosis + Race + Side + KneeScore_Surgeon + 
##     GPS + Manufacturer + FemoralComponentModel + FemoralComponentSize + 
##     FemoralComponentType + TibialTrayModel + TibialTraySize + 
##     TibialInsertModel + TibialInsertWidth + TibialInsertType + 
##     PatellaModel + PatellaDiameter) - Diagnosis - Race - TibialInsertType - 
##     FemoralComponentModel - PatellaModel - TibialTrayModel - 
##     TibialTraySize - FemoralComponentType
## Model 2: KneeScore_Patient ~ Surgeon + Year + Age + Gender + Weight + 
##     Height + BMI + Diagnosis + Race + Side + KneeScore_Surgeon + 
##     GPS + Manufacturer + FemoralComponentModel + FemoralComponentSize + 
##     FemoralComponentType + TibialTrayModel + TibialTraySize + 
##     TibialInsertModel + TibialInsertWidth + TibialInsertType + 
##     PatellaModel + PatellaDiameter
##   Res.Df    RSS Df Sum of Sq      F  Pr(>F)  
## 1   1896 530943                              
## 2   1821 505034 75     25909 1.2456 0.07874 .
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

While removing the predictors doesn’t yeild a definatively better model at a reasonable \(\alpha\), it does increase Adjusted \(R^2\) and slightly improves the cross validation score.

Utilize a Bayesian Information Criterion approach and evaluate the selected model.

n = length(resid(full_additive_model))
model_back_bic = step(full_additive_model, direction = "backward", k = log(n), trace = 0)
evaluate(model_back_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.934109e-15
## 2 Breusch-Pagan pvalue 7.743929e-07
## 3            R Squared    0.2423439
## 4        Adj R Squared    0.2404141
## 5     Cross Validation     16.91176
## 6     Large Hat Values          112
anova(model_back_bic, full_additive_model)
## Analysis of Variance Table
## 
## Model 1: KneeScore_Patient ~ Age + Gender + Weight + Height + KneeScore_Surgeon
## Model 2: KneeScore_Patient ~ Surgeon + Year + Age + Gender + Weight + 
##     Height + BMI + Diagnosis + Race + Side + KneeScore_Surgeon + 
##     GPS + Manufacturer + FemoralComponentModel + FemoralComponentSize + 
##     FemoralComponentType + TibialTrayModel + TibialTraySize + 
##     TibialInsertModel + TibialInsertWidth + TibialInsertType + 
##     PatellaModel + PatellaDiameter
##   Res.Df    RSS  Df Sum of Sq      F   Pr(>F)   
## 1   1963 559528                                 
## 2   1821 505034 142     54494 1.3837 0.002592 **
## ---
## Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

The resulting BIC model scores marginally better in Cross Validation though the anova results clearly prefer the full additive model.

Next, we will evaluate the RMSE of the best performing models thus far.

options(warn=-1)
sample_idx = sample(1:nrow(knee), 1500)
train_data = knee[sample_idx, ]
test_data = knee[-sample_idx, ]

rmse  = function(actual, predicted) {  sqrt(mean((actual - predicted) ^ 2))}

full_test = rmse(test_data$KneeScore_Patient, predict(full_additive_model, test_data))
full_train = rmse(train_data$KneeScore_Patient, predict(full_additive_model, train_data))

test_mod_8 = rmse(test_data$KneeScore_Patient, predict(model_eight, test_data))
train_mod_8 = rmse(train_data$KneeScore_Patient, predict(model_eight, train_data))

bic_test = rmse(test_data$KneeScore_Patient, predict(model_back_bic, test_data))
bic_train = rmse(train_data$KneeScore_Patient, predict(model_back_bic, train_data))
<<<<<<< HEAD ======= >>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
Model Train RMSE Test RMSE
Model All+15.9136983 16.3363653
Model 9 16.1952365 17.1232936
Model BIC 16.6211189 17.591382816.1325071 15.6349101
Model 9 16.4395967 16.3615981
Model BIC 16.8626588 16.8401307

** how are these evalueated again?**

Perhaps removing some influential data points will assist in creating a better model. Next we remove influential points that satisfy the criteria for Large Leverage, Large Cook's Distance and Large RStandard values.

new_dataframe = removeInfluential(model_eight, knee)
model_nine = lm(KneeScore_Patient ~ . - Diagnosis - Race - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - TibialTraySize - FemoralComponentType, data = new_dataframe)
evaluate(model_nine)
##                   Test       Result
## 1  Shapiro Wilk pvalue 2.138818e-14
## 2 Breusch-Pagan pvalue  0.006017515
## 3            R Squared    0.2818728
## 4        Adj R Squared      0.25453
## 5     Cross Validation     16.75786
## 6     Large Hat Values          122
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

This does seem to be of moderate benefit in relation to Adjusted \(R^2\) and Cross Validation.

We will evaluate the same model one again after removing the identified large hat values.

remove_hat_dataframe = removeLargeHatValues(model_nine, new_dataframe)
model_ten = lm(KneeScore_Patient ~ . - Diagnosis - Race - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - TibialTraySize - FemoralComponentType, data = remove_hat_dataframe)
evaluate(model_ten)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.502033e-13
## 2 Breusch-Pagan pvalue 0.0001416128
## 3            R Squared    0.2753799
## 4        Adj R Squared    0.2592862
## 5     Cross Validation     16.92845
## 6     Large Hat Values           94
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

Removing the large hat values decreased the Cross Validation score. Perhaps we can find a better model by running BIC on the modified dataframe.

n = length(resid(model_ten))
model_10_bic = step(model_ten, direction = "backward", k = log(n), trace = 0)
evaluate(model_10_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.298166e-14
## 2 Breusch-Pagan pvalue 1.347535e-06
## 3            R Squared    0.2491407
## 4        Adj R Squared    0.2470959
## 5     Cross Validation     16.91151
## 6     Large Hat Values          101
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

Despite the reduction in leveraged values the selected model is no better than preceeding models.

sample_idx_two = sample(1:nrow(remove_hat_dataframe), 1300)
train_data_two = remove_hat_dataframe[sample_idx_two, ]
test_data_two = remove_hat_dataframe[-sample_idx_two, ]

test_mod_10 = rmse(test_data_two$KneeScore_Patient, predict(model_ten, train_data_two))
train_mod_10 = rmse(train_data_two$KneeScore_Patient, predict(model_ten, train_data_two))

test_10_bic = rmse(test_data_two$KneeScore_Patient, predict(model_10_bic, train_data_two))
train_10_bic = rmse(train_data_two$KneeScore_Patient, predict(model_10_bic, train_data_two))
<<<<<<< HEAD ======= >>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
Model Train RMSE Test RMSE
Model 1015.9629066 24.1098868
Model 10 BIC 16.2865348 23.872716.5482181 22.3038911
Model 10 BIC 16.8429739 22.1294134
<<<<<<< HEAD
=======
>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5

Appendix

Knee Score Patient

model = lm(KneeScore_Patient ~. , data = knee)
model_back_aic = step(model, direction = "backward", trace = 0)
model_for_aic = step(model, direction = "forward", trace = 0)
model_both_aic = step(model, direction = "both", trace = 0)

n = length(resid(model))
model_back_bic = step(model, direction = "backward", k = log(n), trace = 0)
model_for_bic = step(model, direction = "forward", k = log(n), trace = 0)
model_both_bic = step(model, direction = "both", k = log(n), trace = 0)
evaluate(model)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.941649e-14
## 2 Breusch-Pagan pvalue    0.9539336
## 3            R Squared    0.3161338
## 4        Adj R Squared    0.2609288
## 5     Cross Validation     17.07365
## 6     Large Hat Values          178
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
evaluate(model_back_aic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.123084e-14
## 2 Breusch-Pagan pvalue 2.237347e-05
## 3            R Squared    0.2562632
## 4        Adj R Squared    0.2490128
## 5     Cross Validation     16.85518
## 6     Large Hat Values          147
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
evaluate(model_for_aic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.941649e-14
## 2 Breusch-Pagan pvalue    0.9539336
## 3            R Squared    0.3161338
## 4        Adj R Squared    0.2609288
## 5     Cross Validation     17.07365
## 6     Large Hat Values          178
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
evaluate(model_both_aic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.123084e-14
## 2 Breusch-Pagan pvalue 2.237347e-05
## 3            R Squared    0.2562632
## 4        Adj R Squared    0.2490128
## 5     Cross Validation     16.85518
## 6     Large Hat Values          147
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
## Adj.R.Squared: 0.24 - CrossValidation: 16.9
evaluate(model_back_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.934109e-15
## 2 Breusch-Pagan pvalue 7.743929e-07
## 3            R Squared    0.2423439
## 4        Adj R Squared    0.2404141
## 5     Cross Validation     16.91176
## 6     Large Hat Values          112
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
evaluate(model_for_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.941649e-14
## 2 Breusch-Pagan pvalue    0.9539336
## 3            R Squared    0.3161338
## 4        Adj R Squared    0.2609288
## 5     Cross Validation     17.07365
## 6     Large Hat Values          178
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
evaluate(model_both_bic)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.934109e-15
## 2 Breusch-Pagan pvalue 7.743929e-07
## 3            R Squared    0.2423439
## 4        Adj R Squared    0.2404141
## 5     Cross Validation     16.91176
## 6     Large Hat Values          112
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
## Remove large leverages
new_data = removeInfluential(model, knee)
model_without_large_leverages = lm(KneeScore_Patient ~ ., data = new_data)
evaluate(model_without_large_leverages)
##                   Test       Result
## 1  Shapiro Wilk pvalue 1.061121e-14
## 2 Breusch-Pagan pvalue    0.9760654
## 3            R Squared    0.3171425
## 4        Adj R Squared    0.2622432
## 5     Cross Validation     16.88624
## 6     Large Hat Values          174
<<<<<<< HEAD

Removing the records with large hat values and refitting simply introduces more large hat values. What is unique about these rows of data?

which(hatvalues(model) == 1.0)
##   17   19   74   80  240  244  258  354  400  533  638  703  763  876  881 
##   17   19   74   80  240  244  258  354  400  533  638  703  763  876  881 
##  891 1047 1093 1096 1245 1374 1406 1544 1554 1686 1775 1778 1781 1919 1927 
##  891 1047 1093 1096 1245 1374 1406 1544 1554 1686 1775 1778 1781 1919 1927 
=======

Removing the records with large hat values and refitting simply introduces more large hat values. What is unique about these rows of data?

which(hatvalues(model) == 1.0)
##   17   19   74   80  240  244  258  354  400  533  638  703  763  844  876 
##   17   19   74   80  240  244  258  354  400  533  638  703  763  844  876 
##  881  891 1093 1096 1245 1374 1406 1544 1554 1686 1775 1778 1781 1919 1927 
##  881  891 1093 1096 1245 1374 1406 1544 1554 1686 1775 1778 1781 1919 1927 
>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5
## 1938 
## 1938
print(knee[17,])
##    Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race Side
## 17       3 2010  68 Female    148     64 25.43 Osteoarthritis White Left
##    KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 17                48                50  No  DePuy (J&J)
##    FemoralComponentModel FemoralComponentSize FemoralComponentType
## 17             Nonporous                    2   cruciate retaining
##    TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 17       PFC Sigma            2.5         PFC Sigma                 8
##    TibialInsertType      PatellaModel PatellaDiameter
## 17           curved 3-Post Round Dome              32
print(knee[240,])
##     Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race
## 240       3 2010  66 Female    130     68 19.78 Osteoarthritis White
##          Side KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 240 Bilateral                60                80  No       Zimmer
##     FemoralComponentModel FemoralComponentSize FemoralComponentType
## 240        Zimmer Precoat                  E -   cruciate retaining
##     TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 240          Nexgen              4            Nexgen                10
##     TibialInsertType PatellaModel PatellaDiameter
## 240         standard       NexGen              32
print(knee[638,])
##     Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race  Side
## 638       3 2012  67 Female    167     62 30.57 Osteoarthritis White Right
##     KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 638                55                55 Yes  DePuy (J&J)
##     FemoralComponentModel FemoralComponentSize FemoralComponentType
## 638                 Sigma                   4N   cruciate retaining
##     TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 638       PFC Sigma              3         PFC Sigma                 8
##            TibialInsertType                 PatellaModel PatellaDiameter
## 638 high flex,Fixed bearing Triathlon Asymmetric Patella              35
print(knee[1093,])
##      Surgeon Year Age Gender Weight Height   BMI      Diagnosis  Race
## 1093       3 2013  83 Female    130     63 23.05 Osteoarthritis White
##       Side KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 1093 Right                66                50  No  DePuy (J&J)
##      FemoralComponentModel FemoralComponentSize FemoralComponentType
## 1093                 Sigma                    3 posterior stabilized
##      TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 1093       PFC Sigma            2.5         PFC Sigma                 8
##                 TibialInsertType            PatellaModel PatellaDiameter
## 1093 posterior stabilized,curved Oval Dome Patella 3-peg              35
print(knee[1775, ])
##      Surgeon Year Age Gender Weight Height   BMI   Diagnosis  Race  Side
## 1775       3 2014  22   Male    240     77 28.49 Post-trauma White Right
##      KneeScore_Surgeon KneeScore_Patient GPS Manufacturer
## 1775                32                65  No       Zimmer
##      FemoralComponentModel FemoralComponentSize FemoralComponentType
## 1775               Persona                   12   cruciate retaining
##      TibialTrayModel TibialTraySize TibialInsertModel TibialInsertWidth
## 1775         Persona              H           Persona                10
##      TibialInsertType PatellaModel PatellaDiameter
## 1775    Fixed bearing      Persona              38

Each of these has a suspect factor entry. Perhaps the entry only occurs infrequently in the database and is causing the entire row to be flagged as high leverage? Try pulling the suspect columns out of the LM and see what happens.

model_nine = lm(KneeScore_Patient ~ . - TibialInsertType - FemoralComponentModel - PatellaModel - TibialTrayModel - FemoralComponentType, data = knee)
evaluate(model_nine)
##                   Test      Result
## 1  Shapiro Wilk pvalue 2.75403e-14
## 2 Breusch-Pagan pvalue  0.08531339
## 3            R Squared   0.2986236
## 4        Adj R Squared    0.259491
## 5     Cross Validation    16.96213
## 6     Large Hat Values         137
<<<<<<< HEAD

=======

>>>>>>> acc04d98d5bfe4ca322513e908756d502a7488a5